21 research outputs found

    Inferring stabilizing mutations from protein phylogenies: application to influenza hemagglutinin

    One selection pressure shaping sequence evolution is the requirement that a protein fold with sufficient stability to perform its biological functions. We present a conceptual framework that explains how this requirement causes the probability that a particular amino acid mutation is fixed during evolution to depend on its effect on protein stability. We mathematically formalize this framework to develop a Bayesian approach for inferring the stability effects of individual mutations from homologous protein sequences of known phylogeny. This approach predicts published, experimentally measured mutational stability effects (ΔΔG values) with an accuracy that exceeds both a state-of-the-art physicochemical modeling program and the sequence-based consensus approach. As a further test, we use our phylogenetic inference approach to predict stabilizing mutations to influenza hemagglutinin. We introduce these mutations into a temperature-sensitive influenza virus with a defect in its hemagglutinin gene and experimentally demonstrate that some of the mutations allow the virus to grow at higher temperatures. Our work therefore describes a powerful new approach for predicting stabilizing mutations that can be successfully applied even to large, complex proteins such as hemagglutinin. This approach also makes a mathematical link between phylogenetics and experimentally measurable protein properties, potentially paving the way for more accurate analyses of molecular evolution.

    DECK: Distance and environment-dependent, coarse-grained, knowledge-based potentials for protein-protein docking

    Background: Computational approaches to protein-protein docking typically include scoring aimed at improving the rank of the near-native structure relative to the false-positive matches. Knowledge-based potentials improve modeling of protein complexes by taking advantage of the rapidly increasing amount of experimentally derived information on protein-protein association. An essential element of knowledge-based potentials is defining the reference state for an optimal description of the residue-residue (or atom-atom) pairs in the non-interaction state.
    Results: The study presents a new Distance- and Environment-dependent, Coarse-grained, Knowledge-based (DECK) potential for scoring of protein-protein docking predictions. Training sets of protein-protein matches were generated based on bound and unbound forms of proteins taken from the DOCKGROUND resource. Each residue was represented by a pseudo-atom in the geometric center of the side chain. To capture the long-range and the multi-body interactions, residues in different secondary structure elements at protein-protein interfaces were considered as different residue types. Five reference states for the potentials were defined and tested. The optimal reference state was selected and the cutoff effect on the distance-dependent potentials investigated. The potentials were validated on docking decoy sets, showing better performance than the existing potentials used in scoring of protein-protein docking results.
    Conclusions: A novel residue-based statistical potential for protein-protein docking was developed and validated on docking decoy sets. The results show that the scoring function DECK can successfully identify near-native protein-protein matches and thus is useful in protein docking. In addition to the practical application of the potentials, the study provides insights into the relative utility of the reference states, the scope of the distance dependence, and the coarse-graining of the potentials.
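    As a rough illustration of the general idea behind a distance-dependent knowledge-based potential, the sketch below applies the standard inverse-Boltzmann relation, E(bin) = -ln(P_obs / P_ref) in units of kT, to one residue-pair type. It is not the DECK implementation: the function name, the pseudocount, and the example histograms are all hypothetical, and the abstract does not say which of its five reference states corresponds to the `reference_counts` argument used here.

```python
import math


def statistical_potential(observed_counts, reference_counts, pseudocount=1e-6):
    """Return inverse-Boltzmann energies (in units of kT) per distance bin.

    observed_counts  -- pair counts per distance bin in known complexes
    reference_counts -- pair counts per distance bin in the chosen reference state
    pseudocount      -- small value (an assumption) that avoids log(0) in empty bins
    """
    if len(observed_counts) != len(reference_counts):
        raise ValueError("histograms must have the same number of bins")
    obs_total = sum(observed_counts)
    ref_total = sum(reference_counts)
    if obs_total <= 0 or ref_total <= 0:
        raise ValueError("histograms must contain at least one count")

    energies = []
    for obs, ref in zip(observed_counts, reference_counts):
        p_obs = obs / obs_total + pseudocount  # observed bin probability
        p_ref = ref / ref_total + pseudocount  # reference bin probability
        energies.append(-math.log(p_obs / p_ref))
    return energies


# Hypothetical three-bin histograms for a single residue-pair type.
energies = statistical_potential([30, 50, 20], [20, 50, 30])
```

    Bins that are populated more often than the reference state predicts receive negative (favorable) energies, and under-populated bins receive positive ones. This is why the choice of reference state matters so much to the resulting potential.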

    Three-dimensional structure of β-cell-specific zinc transporter, ZnT-8, predicted from the type 2 diabetes-associated gene variant SLC30A8 R325W

    Background: We examined the effects of the R325W mutation on the three-dimensional (3D) structure of the β-cell-specific Zn²⁺ (zinc) transporter ZnT-8.
    Methods: A model of the C-terminal domain of the human ZnT-8 protein was generated by homology modeling based on the known crystal structure of the Escherichia coli (E. coli) zinc transporter YiiP at 3.8 Å resolution.
    Results: The homodimeric ZnT-8 protein adopts a Y-shaped architecture, with Arg325 located at the very bottom of this motif, approximately 13.5 Å from the transmembrane domain juncture. The C-terminal domain sequences of the human ZnT-8 protein and the E. coli zinc transporter YiiP share 12.3% identical and 39.5% homologous residues, resulting in an overall homology of 51.8%. Validation statistics showed that the homology model is of reasonable quality. The C-terminal domain exhibited an αββαβ fold with Arg325 as the penultimate N-terminal residue of the α2-helix. The side chains of both Arg325 and Trp325 point away from the interface with the other monomer, whereas the ε-NH₃⁺ group of Arg325 is predicted to form an ionic interaction with the β-COO⁻ group of Asp326 as well as Asp295. An amino acid alignment of the β2-α2 C-terminal loop domain revealed a variety of neutral amino acids at position 325 of different ZnT-8 proteins.
    Conclusions: Our validated homology models predict that both Arg325 and Trp325, amino acids with a helix-forming behavior and penultimate N-terminal residues in the α2-helix of the C-terminal domain, are shielded by the planar surface of the three cytoplasmic β-strands and hence unable to affect the sensing capacity of the C-terminal domain. Moreover, the amino acid residue at position 325 is too far removed from the docking and transporter parts of ZnT-8 to affect their local protein conformations. These data indicate that the inherited R325W abnormality in SLC30A8 may be tolerated and result in adequate zinc transfer to the correct sites in the pancreatic islet cells. They are consistent with the observation that the SLC30A8 gene variant R325W has low predictive value for future type 2 diabetes at the population level.

    Solvent accessible surface area approximations for rapid and accurate protein structure prediction

    The burial of hydrophobic amino acids in the protein core is a driving force in protein folding. The extent to which an amino acid interacts with the solvent and the protein core is naturally proportional to the surface area exposed to these environments. However, an accurate calculation of the solvent-accessible surface area (SASA), a geometric measure of this exposure, is numerically demanding as it is not pair-wise decomposable. Furthermore, it depends on a full-atom representation of the molecule. This manuscript introduces a series of four SASA approximations of increasing computational complexity and accuracy, as well as knowledge-based environment free energy potentials based on these SASA approximations. Their ability to distinguish correctly from incorrectly folded protein models is assessed to balance speed and accuracy for protein structure prediction. We find that the newly developed “Neighbor Vector” algorithm provides the best balance of accurate yet rapid exposure measures.
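    To make the notion of a pair-wise decomposable exposure measure concrete, the sketch below computes a simple weighted neighbor count per residue from one representative coordinate each (for example, a Cβ position). It is not the paper's Neighbor Vector algorithm, which the abstract does not specify: the function names, the cosine-smoothed weighting, and the 4.0 Å and 11.4 Å thresholds are illustrative assumptions.

```python
import math


def neighbor_weight(distance, lower=4.0, upper=11.4):
    """Weight one neighbor: 1 below `lower`, 0 above `upper`, smooth in between.

    The thresholds (in angstroms) are illustrative assumptions.
    """
    if distance <= lower:
        return 1.0
    if distance >= upper:
        return 0.0
    fraction = (distance - lower) / (upper - lower)
    return 0.5 * (math.cos(fraction * math.pi) + 1.0)


def neighbor_counts(coords, lower=4.0, upper=11.4):
    """Return a weighted neighbor count for every residue.

    coords -- list of (x, y, z) tuples, one representative point per residue.
    A higher count suggests a more buried residue (lower solvent exposure).
    """
    counts = []
    for i, a in enumerate(coords):
        total = 0.0
        for j, b in enumerate(coords):
            if i != j:
                total += neighbor_weight(math.dist(a, b), lower, upper)
        counts.append(total)
    return counts


# Hypothetical coordinates: two close residues and one isolated residue.
counts = neighbor_counts([(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (20.0, 0.0, 0.0)])
```

    Because each term depends only on a single pair of residues, the total can be updated incrementally when one residue moves, which an exact SASA calculation does not allow.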

    Characterization of the wheat endosperm transfer cell-specific protein TaPR60

    The TaPR60 gene from bread wheat encodes a small cysteine-rich protein with a hydrophobic signal peptide, predicted to direct the TaPR60 protein to a secretory pathway. It was demonstrated by heterologous expression of recombinant TaPR60 protein that the signal peptide is recognized and cleaved in yeast cells. The full-length gene including promoter sequence of a TaPR60 orthologue was cloned from a BAC library of Triticum durum. A transcriptional promoter-GUS fusion was stably transformed into wheat, barley and rice. The strongest GUS expression in wheat and barley was found in the endosperm transfer cells, while in rice the promoter was active inside the starchy endosperm during the early stages of grain filling. The TaPR60 gene was also used as bait in a yeast two-hybrid screen. Five proteins were identified in the screen, and for some of these prey proteins, the interaction was confirmed by co-immunoprecipitation. The signal peptide binding proteins, TaUbiL1 and TaUbiL2, are homologues of animal proteins that belong to proteolytic complexes, and therefore may be responsible for TaPR60 processing or degradation of the signal peptide. Other proteins that interact with TaPR60 may have a function in TaPR60 secretion or regulation of this process. Examination of a three-dimensional model of TaPR60 suggested that this protein could be involved in binding of lipidic molecules.
    Nataliya Kovalchuk, Jessica Smith, Margaret Pallotta, Rohan Singh, Ainur Ismagul, Serik Eliby, Natalia Bazanova, Andrew S. Milligan, Maria Hrmova and Peter Langridge, et al.

    Modeling of Protein Tertiary and Quaternary Structures Based on Evolutionary Information

    Proteins are subject to evolutionary forces that shape their three-dimensional structure to meet specific functional demands. The knowledge of the structure of a protein is therefore instrumental to gain information about the molecular basis of its function. However, experimental structure determination is inherently time-consuming and expensive, making it impossible to keep pace with the explosion of sequence data deriving from genome-scale projects. As a consequence, computational structural modeling techniques have received much attention and established themselves as a valuable complement to experimental structural biology efforts. Among these, comparative modeling remains the method of choice to model the three-dimensional structure of a protein when homology to a protein of known structure can be detected. The general strategy consists of using experimentally determined structures of proteins as templates for the generation of three-dimensional models of related family members (targets) of which the structure is unknown. This chapter provides a description of the individual steps needed to obtain a comparative model using SWISS-MODEL, one of the most widely used automated servers for protein structure homology modeling.